Multiple Approximate String Matching

Authors

  • Ricardo A. Baeza-Yates
  • Gonzalo Navarro
Abstract

We present a very simple and efficient algorithm for on-line multiple approximate string matching. It uses a previously known counting-based filter [9] that searches for a single pattern by quickly discarding uninteresting parts of the text. Our multi-pattern algorithm is based on the simulation of many parallel filters using bits of the computer word. Our average complexity to search r patterns of length m is O(rn log m / log n), where n is the text size. We can search patterns of different lengths, each one with a different number of errors. We show experimentally that our algorithm is competitive with the fastest known algorithms, being the fastest for a wide range of intermediate error ratios. We give the first average-case analysis of the filtering efficiency of the counting method, applicable also to [9].
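
As a rough illustration of the single-pattern counting filter mentioned in the abstract, the following Python sketch slides a window of length m over the text and counts how many window characters can be paired with pattern characters (respecting multiplicity). A window is kept as a candidate only if that count reaches m - k, where k is the number of allowed errors. The function names `counting_filter` and `approx_match_ends` are hypothetical, the verifier is a plain dynamic-programming check chosen for brevity, and the bit-parallel multi-pattern simulation described in the abstract is not reproduced here.

```python
from collections import Counter


def counting_filter(pattern, text, k):
    """Return text positions i where the window ending at i is a candidate.

    Hypothetical helper: a candidate window shares at least m - k characters
    (with multiplicity) with the pattern. Candidates still need verification.
    """
    m = len(pattern)
    # remaining[c] = occurrences of c in pattern minus occurrences in window
    remaining = Counter(pattern)
    count = 0  # number of window characters paired with pattern characters
    candidates = []
    for i, c in enumerate(text):
        if i >= m:
            # The character text[i - m] leaves the window.
            out = text[i - m]
            remaining[out] += 1
            if remaining[out] > 0:
                count -= 1
        # The character c enters the window.
        if remaining[c] > 0:
            count += 1
        remaining[c] -= 1
        if count >= m - k:
            candidates.append(i)
    return candidates


def approx_match_ends(pattern, text, k):
    """Reference verifier (assumption, not the paper's method): classic
    column-wise dynamic programming that reports every text position where
    an occurrence of the pattern with at most k edit errors ends."""
    m = len(pattern)
    col = list(range(m + 1))
    ends = []
    for i, c in enumerate(text):
        prev_diag = col[0]  # row 0 is always 0: a match may start anywhere
        for j in range(1, m + 1):
            old = col[j]
            col[j] = min(prev_diag + (pattern[j - 1] != c),
                         old + 1,
                         col[j - 1] + 1)
            prev_diag = old
        if col[m] <= k:
            ends.append(i)
    return ends
```

The filter never discards a true occurrence, but it can report false candidates: for example, the window "cba" passes the filter for the pattern "abc" with k = 0 because it contains the same characters, so only the verifier rejects it.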

Similar resources

Average-Optimal Multiple Approximate String Matching

We present a new algorithm for multiple approximate string matching, based on an extension of the optimal (on average) single-pattern approximate string matching algorithm of Chang and Marr. Our algorithm inherits the optimality and is also competitive in practice. We present a second algorithm that is linear time and handles higher difference ratios. We show experimentally that our algorithms a...

Average-Case Optimal Approximate Circular String Matching

Approximate string matching is the problem of finding all factors of a text t of length n that are at a distance at most k from a pattern x of length m. Approximate circular string matching is the problem of finding all factors of t that are at a distance at most k from x or from any of its rotations. In this article, we present a new algorithm for approximate circular string matching under the...

Improved Single and Multiple Approximate String Matching

We present a new algorithm for multiple approximate string matching. It is based on reading backwards enough ℓ-grams from text windows so as to prove that no occurrence can contain the part of the window read, and then shifting the window. Three variants of the algorithm are presented, which give different tradeoffs between how much they work in the window and how much they shift it. We show an...

Approximate String Matching: Theory and Applications (La Recherche Approchée de Motifs : Théorie et Applications)

Approximate string matching is a fundamental and recurrent problem that arises in most computer science fields. This problem can be defined as follows: Let D = {x1, x2, ..., xd} be a set of d words defined on an alphabet Σ, let q be a query also defined on Σ, and let k be a positive integer. We want to build a data structure on D capable of answering the following query: find all words i...

Approximate Multiple Pattern String Matching using Bit Parallelism: A Review

String matching is the task of finding all occurrences of a given pattern in a large text, both being sequences of characters drawn from a finite alphabet. Approximate String Matching involves the detection of correct patterns along with the detection of some wrong patterns inside the text. Bit Parallelism is a feature that can be used to detect patterns inside the text and is reported to result in mor...

String Matching and its Applications in Diversified Fields

String searching algorithms, sometimes called string matching algorithms, are an important class of string algorithms that try to find a place where one or several strings (also called patterns) are found within a larger string or text.[11] String matching is a classical problem in computer science. In this paper we are trying to explore the various diversified fields where string matching has ...

Publication date: 1997